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Amino Acid 


cca Pro(P) 
ccc Pro(P) 
ccg Pro(P) 
ccu Pro(P) 


agc Ser(S) 
agu Ser(S) 
uca Ser(S) 
ucc Ser(S) 
ucg Ser(S) 

ucu Ser(S) 


aca Thr(T) 
acc Thr(T) 
acg Thr(T) 
acu Thr(T) 


ugg Trp(W) 


uac Tyr(Y) 
uau Tyr(Y) 
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Amino Acid 


gga Gly(G) 
ggc Gly(G) 
ggg Gly(G) 
ggu Gly(G) 


cac His(H) 
cau His(H) 


aua Ile(I) 
auc Ile(I) 
auu Ile(I) 


cua Leu(L) 
cuc Leu(L) 
cug Leu(L) 
cuu Leu(L) 
uua Leu(L) 
uug Leu(L) 


aaa Lys(K) 
aag Lys(K) 


aug Met(M) 


uuc Phe(F) 
uuu Phe(F) 
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Amino Acid 


gca Ala(A) 
gcc Ala(A) 
gcg Ala(A) 
gcu Ala(A) 


aga Arg(R) 
agg Arg(R) 
cga Arg(R) 
cgc Arg(R) 
cgg Arg(R) 
cgu Arg(R) 


aac Asn(N) 
aau Asn(N) 


gac Asp(D) 
gau Asp(D) 


ugc Cys(C) 
ugu Cys(C) 


caa Gln(Q) 
cag Gln(Q) 


gaa Glu(E) 
gag Glu(E) 
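
The codon table above can be expressed as a lookup and used to translate a sequence. The following is a minimal Python sketch, assuming the standard genetic code; the names `CODON_TABLE` and `translate` are illustrative and do not come from the source. Stop codons, which the table above does not list, are shown as `*`.

```python
from itertools import product

# Standard genetic code, amino acids listed with bases cycling in u, c, a, g
# order (first codon position varies slowest). '*' marks a stop codon.
BASES = "ucag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(sequence):
    """Translate an RNA or DNA string (spaces allowed) into one-letter amino acids."""
    rna = sequence.lower().replace(" ", "").replace("t", "u")
    usable = len(rna) - len(rna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    print(CODON_TABLE["ccu"], CODON_TABLE["agc"], CODON_TABLE["cgg"])
    print(translate("ATG GGT GCG AGA GCG TCG GTA TTA AGC GGG GGA GAA TTA GAT AAA"))
```

The lookup accepts the lowercase RNA codons used in the table, and `translate` also accepts DNA written with `T`.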
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• Regulates HIV gene expression by promoting 
cytoplasmic levels of unspliced and singly spliced 
mRNAs 

• Postulated to affect splicing, stability, transport, 
and translation 



Fig. 4 



Codon Optimization of HIV gagpol 

• Remove A-rich instability elements 

• Improve translational efficiency 

• Reduce risk of recombination with transfer vector 



Fig. 5 
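
The goals listed for codon optimization can be illustrated with a small recoding sketch in Python. It assumes a hypothetical table of one GC-rich codon per amino acid (`PREFERRED_CODON` is an illustrative choice, not a table given in the source) and recodes a 15-residue stretch from the start of gag. The helper names `back_translate` and `a_fraction` are likewise assumptions.

```python
# Hypothetical preferred-codon table: one GC-rich codon per amino acid.
# This is an assumption for illustration, not a table taken from the source.
PREFERRED_CODON = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "TCC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
}


def back_translate(protein):
    """Recode a one-letter protein string using the preferred codon for each residue."""
    return "".join(PREFERRED_CODON[aa] for aa in protein)


def a_fraction(dna):
    """Fraction of adenine bases in a DNA string (spaces are ignored)."""
    seq = dna.replace(" ", "").upper()
    return seq.count("A") / len(seq)


# Wild-type codons for the first 15 residues of gag (NL4-3 row of the alignment).
wild_type = "ATG GGT GCG AGA GCG TCG GTA TTA AGC GGG GGA GAA TTA GAT AAA"
recoded = back_translate("MGARASVLSGGELDK")

if __name__ == "__main__":
    print(recoded)
    print(round(a_fraction(wild_type), 3), round(a_fraction(recoded), 3))
```

Because the protein is unchanged while most codons differ, this kind of recoding lowers A content and reduces nucleotide identity with the wild-type sequence, which is the property the recombination bullet relies on.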
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Alignment Report of Codon optimization (gag). MEG, using Clustal method with PAM250 residue weight table. 
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810 
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792 MGARASVLSGGELDK NL4-3 genbank.SEQ 

792 ATG GGT GCG AGA GCG TCG GTA TTA AGC GGG GGA GAA TTA GAT AAA 

1319 MGARASVLSGGELDK pHDMHgpm2 . seq 

1319 ATG GGC GCC CGC GCC TCC GTG CTG TCC GGC GGC GAG CTG GAC AAG 

1 "I — 

840 870 

I 1 

837 WEKIRLRPGGKKQYK NL4-3 genbank.SEQ 

837 TGG GAA AAA ATT CGG TTA AGG CCA GGG GGA AAG AAA CAA TAT AAA 

1364 WEKIRLRPGGKKQYK pHDMHgpm2 . seq 
1364 TGG GAG AAG ATC CGC CTG CGC CCC GGC GGC AAG AAG CAG TAC AAG 

1 ■ — 

900 

. i 

882 LKH IVWAS RELERFA NL4-3 genbank.SEQ 
882 CTA AAA CAT ATA GTA TGG GCA AGC AGG GAG CTA GAA CGA TTC GCA 

1409 LKHIVWAS RELERFA pHDMHgpm2 . seq 
1409 CTG AAG CAC ATC GTG TGG GCC TCC CGC GAG CTG GAG CGC TTC GCC 

1 1 

930 960 

I 1 

927 VNPGLLET S EGCRQI NL4-3 genbank.SEQ 

927 GTT AAT CCT GGC CTT TTA GAG ACA TCA GAA GGC TGT AGA CAA ATA 

1454 VNPGLLET SEGCRQI pHDMHgpm2 . seq 
1454 GTG AAC CCC GGC CTG CTG GAG ACC TCC GAG GGC TGC CGC CAG ATC 

990 

i . 

972 LGQLQPS LQTGSEEL NL4-3 genbank.SEQ 
972 CTG GGA CAG CTA CAA CCA TCC CTT CAG ACA GGA TCA GAA GAA CTT 

1499 LGQLQPS LQTGSEEL pHDMHgpm2 . seq 
1499 CTG GGC CAG CTG CAG CCC TCC CTG CAA ACC GGC TCC GAG GAG CTG 

1 1 

1020 1050 

! 1 

1017 RSLYNTIAVLYCVHQ NL4-3 genbank.SEQ 
1017 AGA TCA TTA TAT AAT ACA ATA GCA GTC CTC TAT TGT GTG CAT CAA 

1544 RSLYNTIAVLYCVHQ pHDMHgpm2.seq 
1544 CGC TCC CTG TAC AAC ACC ATC GCC GTG CTG TAC TGC GTG CAC CAG 

— . T — ■ 

1080 

1 

1062 RI DVKDT KEALDKI E NL4-3 genbank.SEQ 
1062 AGG ATA GAT GTA AAA GAC ACC AAG GAA GCC TTA GAT AAG ATA GAG 

1589 RIDVKDT KEALDKIE pHDMHgpm2 . seq 
1589 CGC ATC GAC GTG AAG GAC ACC AAG GAG GCC CTG GAC AAG ATC GAG 

, 1 

1110 1140 

I 1 

1107 EEQNKS KKKAQQAAA NL4-3 genbank.SEQ 

1107 GAA GAG CAA AAC AAA AGT AAG AAA AAG GCA CAG CAA GCA GCA GCT 

1634 EEQNKS KKKAQQAAA pHDMHgpm2 . seq 
1634 GAG GAG CAG AAC AAG TCC AAG AAG AAG GCC CAG CAG GCC GCC GCC 



Fig. 8A 
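
Each pair of rows in the alignment shows the same peptide encoded by different codons. A minimal Python sketch of how one such pair could be checked is given below, assuming the standard genetic code; `compare_rows` and `gc_fraction` are hypothetical helper names, and the two rows are the first pair from the report above.

```python
from itertools import product

# Standard genetic code for DNA codons, bases cycling in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def compare_rows(native, recoded):
    """Return (all codons synonymous?, number of changed codons, total codons)."""
    a, b = native.split(), recoded.split()
    if len(a) != len(b):
        raise ValueError("rows must contain the same number of codons")
    synonymous = all(CODON_TABLE[x] == CODON_TABLE[y] for x, y in zip(a, b))
    changed = sum(x != y for x, y in zip(a, b))
    return synonymous, changed, len(a)


def gc_fraction(row):
    """Fraction of G and C bases in a space-separated codon row."""
    seq = row.replace(" ", "")
    return (seq.count("G") + seq.count("C")) / len(seq)


native_row = "ATG GGT GCG AGA GCG TCG GTA TTA AGC GGG GGA GAA TTA GAT AAA"
recoded_row = "ATG GGC GCC CGC GCC TCC GTG CTG TCC GGC GGC GAG CTG GAC AAG"

if __name__ == "__main__":
    print(compare_rows(native_row, recoded_row))
    print(round(gc_fraction(native_row), 3), round(gc_fraction(recoded_row), 3))
```

For this pair every codon except the initiating ATG differs, yet both rows translate to the same residues.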



Alignment Report of Codon optimization (gag).MEG, using Clustal method with PAM250 residue weight table. 
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1152 DTGNNSQVSQNYPIV NL4-3 genbank.SEQ 

1152 GAC ACA GGA AAC AAC AGC CAG GTC AGC CAA AAT TAC CCT ATA GTG 

1679 DTGNNSQVSQNYPIV pHDMHgpm2 . seq 

1679 GAC ACC GGC AAC AAC TCC CAG GTG TCC CAG AAC TAC CCC ATC GTG 

1 ] 

1200 1230 

I ! 

1197 QNLQGQMVHQAI S PR NL4-3 genbank.SEQ 

1197 CAG AAC CTC CAG GGG CAA ATG GTA CAT CAG GCC ATA TCA CCT AGA 

1724 QNLQGQMVHQAI S PR pHDMHgpm2 . seq 
1724 CAG AAC CTG CAG GGC CAG ATG GTG CAC CAG GCC ATC TCC CCC CGC 

1 ~ 

1260 

I . - 

1242 TLNAWVKVVEEKAFS NL4-3 genbank.SEQ 

1242 ACT TTA AAT GCA TGG GTA AAA GTA GTA GAA GAG AAG GCT TTC AGC 

1769 TLNAWVKVVEEKAFS pHDMHgpm2 . seq 
1769 ACC CTG AAC GCC TGG GTG AAG GTG GTG GAG GAG AAG GCC TTC TCC 

1 , 

1290 1320 
I l 

1287 PEVIPMFSALSEGAT NL4-3 genbank.SEQ 

1287 CCA GAA GTA ATA CCC ATG TTT TCA GCA TTA TCA GAA GGA GCC ACC 

1814 PEVIPMFSALSEGAT pHDMHgpm2 . seq 

1814 CCC GAA GTC ATC CCC ATG TTC TCC GCC CTG TCC GAG GGC GCC ACC 



i 

1350 



1332 PQDLNTMLNTVGGHQ NL4-3 genbank.SEQ 

1332 CCA CAA GAT TTA AAT ACC ATG CTA AAC ACA GTG GGG GGA CAT CAA 

1859 PQDLNTMLNTVGGHQ pHDMHgpm2 . seq 

1859 CCC CAG GAC CTG AAC ACC ATG CTG AAC ACC GTG GGC GGC CAC CAG 

1 1 

1380 1410 

I i 

1377 AAMQMLKET INEEAA NL4-3 genbank.SEQ 

1377 GCA GCC ATG CAA ATG TTA AAA GAG ACC ATC AAT GAG GAA GCT GCA 

1904 AAMQMLKETINEEAA pHDMHgpm2.seq 
1904 GCC GCC ATG CAG ATG CTG AAG GAG ACC ATC AAC GAG GAG GCC GCC 

1 

1440 

I 

1422 EWDRLH PVHAGPIAP NL4-3 genbank.SEQ 

1422 GAA TGG GAT AGA TTG CAT CCA GTG CAT GCA GGG CCT ATT GCA CCA 

1949 EWDRLH PVHAGPIAP pHDMHgpm2 . seq 
1949 GAG TGG GAC CGC CTG CAC CCC GTG CAC GCC GGC CCC ATC GCC CCC 

1 1 

1470 1500 

l I 

1467 GQMREP RG S DIAGTT NL4-3 genbank.SEQ 

1467 GGC CAG ATG AGA GAA CCA AGG GGA AGT GAC ATA GCA GGA ACT ACT 

1994 GQMREP RG S DIAGTT pHDMHgpm2 . seq 
1994 GGC CAG ATG CGC GAG CCC CGC GGC TCC GAC ATC GCC GGC ACC ACC 



Fig. 8B 



Alignment Report of Codon optimization (gag).MEG, using Clustal method with PAM250 residue weight table. 
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1512 STLQEQIGWMTHNPP NL4-3 genbank.SEQ 
1512 AGT ACC CTT CAG GAA CAA ATA GGA TGG ATG ACA CAT AAT CCA CCT 

2039 STLQEQIGWMTHNPP pHDMHgpm2.seq 
2039 TCC ACC CTG CAA GAG CAG ATC GGC TGG ATG ACC CAC AAC CCC CCC 

1 1 

1560 1590 
l I 

1557 IPVGEIYKRWI ILGL NL4-3 genbank.SEQ 

1557 ATC CCA GTA GGA GAA ATC TAT AAA AGA TGG ATA ATC CTG GGA TTA 

2084 IPVGEIYKRWI ILGL pHDMHgpm2 . seq 

2084 ATC CCC GTG GGC GAG ATC TAC AAG CGC TGG ATC ATC CTG GGC CTG 

1 : 

1620 

I 

1602 NKIVRMYS PTSILDI NL4-3 genbank.SEQ 

1602 AAT AAA ATA GTA AGA ATG TAT AGC CCT ACC AGC ATT CTG GAC ATA 

2129 NKIVRMYSPTSILDI pHDMHgpm2.seq 
2129 AAC AAG ATC GTG CGC ATG TAC TCC CCC ACC TCC ATC CTG GAC ATC 



1650 1680 
I I 

1647 RQGPKEP FRDYVDRF NL4-3 genbank.SEQ 
1647 AGA CAA GGA CCA AAG GAA CCC TTT AGA GAC TAT GTA GAC CGA TTC 

2174 RQGPKEPFRDYVDRF pHDMHgpm2.seq 
2174 CGC CAG GGC CCC AAG GAG CCC TTC CGC GAC TAC GTG GAC CGC TTC 

1 

1710 

I 

1692 YKTLRAEQASQEVKN NL4-3 genbank.SEQ 

1692 TAT AAA ACT CTA AGA GCC GAG CAA GCT TCA CAA GAG GTA AAA AAT 

2219 YKTLRAEQASQEVKN pHDMHgpm2 . seq 
2219 TAC AAG ACC CTG CGC GCC GAG CAG GCC TCC CAG GAG GTA AAG AAC 

1 1 

1740 1770 

I 1 

1737 WMTETLLVQNANPDC NL4-3 genbank.SEQ 

1737 TGG ATG ACA GAA ACC TTG TTG GTC CAA AAT GCG AAC CCA GAT TGT 

2264 WMTETLLVQNANPDC pHDMHgpm2.seq 
2264 TGG ATG ACC GAG ACC CTG CTG GTG CAG AAC GCC AAC CCC GAC TGC 

1 

1800 

1782 KTI LKALG P GATLEE NL4-3 genbank.SEQ 
1782 AAG ACT ATT TTA AAA GCA TTG GGA CCA GGA GCG ACA CTA GAA GAA 

2309 KTI LKALGP GATLEE pHDMHgpm2 . seq 
2309 AAG ACC ATC CTG AAG GCC CTG GGC CCC GGC GCC ACC CTG GAG GAG 

, 1 

1830 1860 

I 1 

1827 MMTACQGVGGPGHKA NL4-3 genbank.SEQ 

1827 ATG ATG ACA GCA TGT CAG GGA GTG GGG GGA CCC GGC CAT AAA GCA 

2354 MMTACQGVGGP GHKA pHDMHgpm2 . seq 
2354 ATG ATG ACC GCC TGC CAG GGC GTG GGC GGC CCC GGC CAC AAG GCC 



Fig. 8C 



Alignment Report of Codon optimization (gag). MEG, using Clustal method with PAM250 residue weight table. 
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1872 RVLAEAMS QVTNPAT NL4-3 genbank.SEQ 

1872 AGA GTT TTG GCT GAA GCA ATG AGC CAA GTA ACA AAT CCA GCT ACC 

2399 RVLAEAMS QVTNPAT pHDMHgpm2 . seq 

2399 CGC GTG CTG GCC GAG GCC ATG TCC CAA GTC ACC AAC CCC GCC ACC 

1 1 

1920 1950 
i i 

1917 IMIQKGNFRNQRKTV NL4-3 genbank.SEQ 

1917 ATA ATG ATA CAG AAA GGC AAT TTT AGG AAC CAA AGA AAG ACT GTT 

2444 IMIQKGNFRNQRKTV pHDMHgpm2 . seq 

2444 ATC ATG ATC CAG AAG GGC AAC TTC CGC AAC CAG CGC AAG ACC GTG 

1 

1980 

I 

1962 KCFNCGKEGHIAKNC NL4-3 genbank.SEQ 
1962 AAG TGT TTC AAT TGT GGC AAA GAA GGG CAC ATA GCC AAA AAT TGC 

2489 KCFNCGKEGHIAKNC pHDMHgpm2 . seq 
2489 AAG TGC TTC AAC TGC GGC AAG GAG GGC CAC ATC GCC AAG AAC TGC 

1 i 

2010 2040 
I I 



2007 RAPRKKGCWKCGKEG NL4-3 genbank.SEQ 

2007 AGG GCC CCT AGG AAA AAG GGC TGT TGG AAA TGT GGA AAG GAA GGA 

2534 RAPRKKGCWKCGKEG pHDMHgpm2.seq 

2534 CGC GCC CCC CGC AAG AAG GGC TGC TGG AAG TGC GGC AAG GAG GGC 

1 

2070 

I 

2052 HQMKDCTERQANFLG NL4-3 genbank.SEQ 
2052 CAC CAA ATG AAA GAT TGT ACT GAG AGA CAG GCT AAT TTT TTA GGG 

2579 HQMKDCTERQANFLG pHDMHgpm2 . seq 
2579 CAC CAG ATG AAA GAT TGT ACT GAG AGA CAG GCT AAT TTT TTA GGG 



1 1 

2100 2130 
I I 

2097 KIWPSHK GRPGNFLQ NL4-3 genbank.SEQ 

2097 AAG ATC TGG CCT TCC CAC AAG GGA AGG CCA GGG AAT TTT CTT CAG 

2624 KIWPSHKGRPGNFLQ pHDMHgpm2 . seq 

2624 AAG ATC TGG CCT TCC CAC AAG GGA AGG CCA GGG AAT TTT CTT CAG 

1 

2160 

I 

2142 SRPEPTAP PEESFRF NL4-3 genbank.SEQ 
2142 AGC AGA CCA GAG CCA ACA GCC CCA CCA GAA GAG AGC TTC AGG TTT 

2669 SRPEPTAPPEESFRF pHDMHgpm2 . seq 
2669 AGC AGA CCA GAG CCA ACA GCC CCA CCA GAA GAG AGC TTC AGG TTT 

j 1 

2190 2220 
1 I 

2187 GEETTTPSQKQEPID NL4-3 genbank.SEQ 

2187 GGG GAA GAG ACA ACA ACT CCC TCT CAG AAG CAG GAG CCG ATA GAC 

2714 GEETTTPSQKQEPID pHDMHgpm2 . seq 

2714 GGG GAA GAG ACA ACA ACT CCC TCT CAG AAG CAG GAG CCG ATA GAC 



Fig. 8D 



Alignment Report of Codon optimization (gag).MEG, using Clustal method with PAM250 residue weight table. 
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2232 KELYPLASLRSLFGS NL4-3 genbank.SEQ 

2232 AAG GAA CTG TAT CCT TTA GCT TCC CTC AGA TCA CTC TTT GGC AGC 

2759 KELYPLASLRSLFGS pHDMHgpm2 . seq 
2759 AAG GAA CTG TAT CCT TTA GCT TCC CTC AGA TCA CTC TTT GGC AGC 

1 

2280 

1 

2277 DPSSQ NL4-3 genbank.SEQ 
2277 GAC CCC TCG TCA CAA TAA 

2804 DPSSQ pHDMHgpm2.seq 
2804 GAC CCC TCG TCA CAA TAA 



Fig. 8E 



Alignment Report of Codon Optimization (pol).MEG, using Clustal method with PAM250 residue weight table. 



2090 2120 

2087 FFREDLAFPQGKARE NL4-3 genbank.SEQ 
2087 TTT TTT AGG GAA GAT CTG GCC TTC CCA CAA GGG AAG GCC AGG GAA 

2085 FFREDLAFPQGKARE pNL4-3.seq 
2085 TTT TTT AGG GAA GAT CTG GCC TTC CCA CAA GGG AAG GCC AGG GAA 

2612 FFREDLAFPQGKARE pHDMHgpm2.seq 
2612 TTT TTT AGG GAA GAT CTG GCC TTC CCA CAA GGG AAG GCC AGG GAA 

2150 

2132 FSSEQTRANSPTRRE NL4-3 genbank.SEQ 
2132 TTT TCT TCA GAG CAG ACC AGA GCC AAC AGC CCC ACC AGA AGA GAG 

2130 FSSEQTRANSPTRRE pNL4-3.seq 
2130 TTT TCT TCA GAG CAG ACC AGA GCC AAC AGC CCC ACC AGA AGA GAG 

2657 FSSEQTRANSPTRRE pHDMHgpm2.seq 
2657 TTT TCT TCA GAG CAG ACC AGA GCC AAC AGC CCC ACC AGA AGA GAG 

2180 2210 

2177 LQVWGRDNNSLSEAG NL4-3 genbank.SEQ 
2177 CTT CAG GTT TGG GGA AGA GAC AAC AAC TCC CTC TCA GAA GCA GGA 

2175 LQVWGRDNNSLSEAG pNL4-3.seq 
2175 CTT CAG GTT TGG GGA AGA GAC AAC AAC TCC CTC TCA GAA GCA GGA 

2702 LQVWGRDNNSLSEAG pHDMHgpm2.seq 
2702 CTT CAG GTT TGG GGA AGA GAC AAC AAC TCC CTC TCA GAA GCA GGA 

2240 

2222 ADRQGTVSFSFPQIT NL4-3 genbank.SEQ 
2222 GCC GAT AGA CAA GGA ACT GTA TCC TTT AGC TTC CCT CAG ATC ACT 

2220 ADRQGTVSFSFPQIT pNL4-3.seq 
2220 GCC GAT AGA CAA GGA ACT GTA TCC TTT AGC TTC CCT CAG ATC ACT 

2747 ADRQGTVSFSFPQIT pHDMHgpm2.seq 
2747 GCC GAT AGA CAA GGA ACT GTA TCC TTT AGC TTC CCT CAG ATC ACT 

2270 2300 

2267 LWQRPLVTIKIGGQL NL4-3 genbank.SEQ 
2267 CTT TGG CAG CGA CCC CTC GTC ACA ATA AAG ATA GGG GGG CAA TTA 

2265 LWQRPLVTIKIGGQL pNL4-3.seq 
2265 CTT TGG CAG CGA CCC CTC GTC ACA ATA AAG ATA GGG GGG CAA TTA 

2792 LWQRPLVTIKIGGQL pHDMHgpm2.seq 
2792 CTT TGG CAG CGA CCC CTC GTC ACA ATA AAG ATC GGT GGC CAG CTG 

2330 

2312 KEALLDTGADDTVLE NL4-3 genbank.SEQ 
2312 AAG GAA GCT CTA TTA GAT ACA GGA GCA GAT GAT ACA GTA TTA GAA 

2310 KEALLDTGADDTVLE pNL4-3.seq 
2310 AAG GAA GCT CTA TTA GAT ACA GGA GCA GAT GAT ACA GTA TTA GAA 

2837 KEALLDTGADDTVLE pHDMHgpm2.seq 
2837 AAG GAG GCC CTG CTG GAC ACC GGC GCC GAC GAC ACC GTG CTG GAG 
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Alignment Report of Codon Optimization (pol).MEG, using Clustal method with PAM250 residue weight table. 



2360 2390 

2357 EMNLPGRWKPKMIGG NL4-3 genbank.SEQ 
2357 GAA ATG AAT TTG CCA GGA AGA TGG AAA CCA AAA ATG ATA GGG GGA 

2355 EMNLPGRWKPKMIGG pNL4-3.seq 
2355 GAA ATG AAT TTG CCA GGA AGA TGG AAA CCA AAA ATG ATA GGG GGA 

2882 EMNLPGRWKPKMIGG pHDMHgpm2.seq 
2882 GAG ATG AAC CTG CCC GGC CGC TGG AAG CCC AAG ATG ATC GGC GGC 

2420 

2402 IGGFIKVGQYDQILI NL4-3 genbank.SEQ 
2402 ATT GGA GGT TTT ATC AAA GTA GGA CAG TAT GAT CAG ATA CTC ATA 

2400 IGGFIKVRQYDQILI pNL4-3.seq 
2400 ATT GGA GGT TTT ATC AAA GTA AGA CAG TAT GAT CAG ATA CTC ATA 

2927 IGGFIKVRQYDQILI pHDMHgpm2.seq 
2927 ATC GGC GGC TTC ATC AAA GTC CGC CAG TAC GAC CAG ATC CTG ATC 

2450 2480 

2447 EICGHKAIGTVLVGP NL4-3 genbank.SEQ 
2447 GAA ATC TGC GGA CAT AAA GCT ATA GGT ACA GTA TTA GTA GGA CCT 

2445 EICGHKAIGTVLVGP pNL4-3.seq 
2445 GAA ATC TGC GGA CAT AAA GCT ATA GGT ACA GTA TTA GTA GGA CCT 

2972 EICGHKAIGTVLVGP pHDMHgpm2.seq 
2972 GAG ATC TGC GGC CAC AAG GCC ATC GGC ACC GTG CTG GTG GGC CCC 

2510 

2492 TPVNIIGRNLLTQIG NL4-3 genbank.SEQ 
2492 ACA CCT GTC AAC ATA ATT GGA AGA AAT CTG TTG ACT CAG ATT GGC 

2490 TPVNIIGRNLLTQIG pNL4-3.seq 
2490 ACA CCT GTC AAC ATA ATT GGA AGA AAT CTG TTG ACT CAG ATT GGC 

3017 TPVNIIGRNLLTQIG pHDMHgpm2.seq 
3017 ACC CCC GTG AAC ATC ATC GGC CGC AAC CTG CTG ACC CAG ATC GGC 

2540 2570 

2537 CTLNFPISPIETVPV NL4-3 genbank.SEQ 
2537 TGC ACT TTA AAT TTT CCC ATT AGT CCT ATT GAG ACT GTA CCA GTA 

2535 CTLNFPISPIETVPV pNL4-3.seq 
2535 TGC ACT TTA AAT TTT CCC ATT AGT CCT ATT GAG ACT GTA CCA GTA 

3062 CTLNFPISPIETVPV pHDMHgpm2.seq 
3062 TGC ACC CTG AAC TTC CCC ATC TCC CCC ATC GAG ACC GTG CCC GTG 

2600 

2582 KLKPGMDGPKVKQWP NL4-3 genbank.SEQ 
2582 AAA TTA AAG CCA GGA ATG GAT GGC CCA AAA GTT AAA CAA TGG CCA 

2580 KLKPGMDGPKVKQWP pNL4-3.seq 
2580 AAA TTA AAG CCA GGA ATG GAT GGC CCA AAA GTT AAA CAA TGG CCA 

3107 KLKPGMDGPKVKQWP pHDMHgpm2.seq 
3107 AAG CTG AAG CCC GGC ATG GAC GGC CCC AAA GTC AAG CAG TGG CCC 
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Alignment Report of Codon Optimization (pol).MEG, using Clustal method with PAM250 residue weight table. 
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2660 
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LTEEKI KALVEICTE 
TTG ACA GAA GAA AAA ATA AAA GCA TTA GTA GAA ATT TGT ACA GAA 

LTEEKI KALVEICTE 
TTG ACA GAA GAA AAA ATA AAA GCA TTA GTA GAA ATT TGT ACA GAA 

LTEE KI KALVEICTE 
CTG ACC GAG GAG AAG ATC AAG GCC CTG GTG GAG ATC TGC ACC GAG 
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ATG GAA AAG GAA GGA AAA ATT TCA AAA ATT GGG CCT GAA AAT CCA 
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ATG GAA AAG GAA GGA AAA ATT TCA AAA ATT GGG CCT GAA AAT CCA 
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ATG GAG AAG GAG GGC AAG ATC TCC AAG ATC GGC CCC GAG AAC CCC 
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Y NTPVFAI KKKDSTK 
TAC AAT ACT CCA GTA TTT GCC ATA AAG AAA AAA GAC AGT ACT AAA 

YNTPVFAIKKKDSTK 
TAC AAT ACT CCA GTA TTT GCC ATA AAG AAA AAA GAC AGT ACT AAA 

YNTPVFAIKKKDSTK 
TAC AAC ACC CCC GTG TTC GCC ATC AAG AAG AAG GAC TCC ACC AAG 
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TGG CGC AAG CTG GTG GAC TTC CGC GAG CTG AAC AAG CGC ACC CAG 
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DFWEVQ LGI PHPAGL 
GAT TTC TGG GAA GTT CAA TTA GGA ATA CCA CAT CCT GCA GGG TTA 

DFWEVQ LGI PHPAGL 
GAT TTC TGG GAA GTT CAA TTA GGA ATA CCA CAT CCT GCA GGG TTA 
DFWEVQ LGI PHPAGL 
3332 GAC TTC TGG GAG GTG CAG CTG GGC ATC CCC CAC CCC GCC GGC CTG 
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3332 
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KQKKSVTVLDVGDAY 
AAA CAG AAA AAA TCA GTA ACA GTA CTG GAT GTG GGC GAT GCA TAT 

KQKKSVTVLDVGDAY 
AAA CAG AAA AAA TCA GTA ACA GTA CTG GAT GTG GGC GAT GCA TAT 

KQKKSVTVLDVGDAY 
AAG CAG AAG AAG TCC GTG ACC GTG CTG GAC GTG GGC GAC GCC TAC 
Alignment Report of Codon Optimization (pol).MEG, using Clustal method with PAM250 residue weight table. 
YNVLPQGWKGSPAIF 
TAC AAT GTG CTT CCA CAG GGA TGG AAA GGA TCA CCA GCA ATA TTC 

YNVLPQGWKGSPAIF 
TAC AAT GTG CTT CCA CAG GGA TGG AAA GGA TCA CCA GCA ATA TTC 

YNVLPQGWKGSPAIF 
TAC AAC GTG CTG CCC CAG GGC TGG AAG GGC TCC CCC GCC ATC TTC 
QCSMTKILEPFRKQN 
CAG TGT AGC ATG ACA AAA ATC TTA GAG CCT TTT AGA AAA CAA AAT 

QCSMTKILEPFRKQN 
CAG TGT AGC ATG ACA AAA ATC TTA GAG CCT TTT AGA AAA CAA AAT 

QCSMTKILEPFRKQN 
CAG TGC TCC ATG ACC AAG ATC CTG GAG CCC TTC CGC AAG CAG AAC 
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Alignment Report of Codon Optimization (pol).MEG, using Clustal method with PAM250 residue weight table. 
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3170 
I 



1 

3200 
I 



3167 RQHLLRWGFTTPDKK 

3167 AGA CAA CAT CTG TTG AGG TGG GGA TTT ACC ACA CCA GAC AAA AAA 
3165 RQHLLRWGFTTPDKK 

3165 AGA CAA CAT CTG TTG AGG TGG GGA TTT ACC ACA CCA GAC AAA AAA 
3692 RQHLLRWGFTTPDKK 

3692 CGC CAG CAC CTG CTG CGC TGG GGC TTC ACC ACC CCC GAC AAG AAG 



NL4-3 genbank.SEQ 
pNL4-3 . seq 
pHDMHgpm2 . seq 
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CAT CAG AAA GAA CCT CCA TTC CTT TGG ATG GGT TAT GAA CTC CAT 
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CAT CAG AAA GAA CCT CCA TTC CTT TGG ATG GGT TAT GAA CTC CAT 



H 
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M 
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CAC CAG AAG GAG CCC CCC TTC CTG TGG ATG GGC TAC GAG CTG CAC 



1 

3260 
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— I 

3290 
l 



PDKWTVQPIVLPEKD 
CCT GAT AAA TGG ACA GTA CAG CCT ATA GTG CTG CCA GAA AAG GAC 

PDKWTVQPIVLPEKD 
CCT GAT AAA TGG ACA GTA CAG CCT ATA GTG CTG CCA GAA AAG GAC 

PDKWTVQPIVLPEKD 
CCC GAC AAG TGG ACC GTG CAG CCC ATC GTG CTG CCC GAG AAG GAC 



1 

3320 
i 



SWTVNDIQKLVGKLN 
AGC TGG ACT GTC AAT GAC ATA CAG AAA TTA GTG GGA AAA TTG AAT 

SWTVNDIQKLVGKLN 
AGC TGG ACT GTC AAT GAC ATA CAG AAA TTA GTG GGA AAA TTG AAT 

SWTVNDIQKLVGKLN 
TCC TGG ACC GTG AAC GAC ATC CAG AAG CTG GTG GGC AAG CTG AAC 



1 

3350 
) 



3380 



WASQIYAGIKVRQLC 
TGG GCA AGT CAG ATT TAT GCA GGG ATT AAA GTA AGG CAA TTA TGT 

WASQIYAGIKVRQLC 
TGG GCA AGT CAG ATT TAT GCA GGG ATT AAA GTA AGG CAA TTA TGT 

WASQIYAGIKVRQLC 
TGG GCC TCC CAG ATC TAC GCC GGC ATC AAA GTC CGC CAG CTG TGC 



1 

3410 
I 



KLLR GTKALTEVVPL 
AAA CTT CTT AGG GGA ACC AAA GCA CTA ACA GAA GTA GTA CCA CTA 

KLLRGTKALTEVVPL 
AAA CTT CTT AGG GGA ACC AAA GCA CTA ACA GAA GTA GTA CCA CTA 

KLLRGTKALTEVVPL 
AAG CTG CTG CGC GGC ACC AAG GCC CTG ACC GAG GTG GTG CCC CTG 
Alignment Report of Codon Optimization (pol).MEG, using Clustal method with PAM250 residue weight table. 



\ i — — 

3440 ' 3470 
1 I 

3437 TEEAELELAENREIL NL4-3 genbank.SEQ 

3437 ACA GAA GAA GCA GAG CTA GAA CTG GCA GAA AAC AGG GAG ATT CTA 

3435 TEEAELELAENREIL pNL4-3.seq 

3435 ACA GAA GAA GCA GAG CTA GAA CTG GCA GAA AAC AGG GAG ATT CTA 

3962 TEEAELELAENREIL pHDMHgpm2 . seq 

3962 ACC GAG GAG GCC GAG CTG GAG CTG GCC GAG AAC CGC GAG ATC CTG 

3500 

I 

3482 KEPVHGVYYDPSKDL NL4-3 genbank.SEQ 

3482 AAA GAA CCG GTA CAT GGA GTG TAT TAT GAC CCA TCA AAA GAC TTA 

3480 KEPVHGVYYDPSKDL pNL4-3.seq 

3480 AAA GAA CCG GTA CAT GGA GTG TAT TAT GAC CCA TCA AAA GAC TTA 

4007 KEPVHGVYYDPSKDL pHDMHgpm2 . seq 
4007 AAG GAG CCC GTG CAC GGC GTG TAC TAC GAC CCC TCC AAG GAC CTG 

T— 1 

3530 3560 
1 I 

3527 IAEIQKQGQGQWTYQ NL4-3 genbank.SEQ 

3527 ATA GCA GAA ATA CAG AAG CAG GGG CAA GGC CAA TGG ACA TAT CAA 

3525 IAEIQKQGQGQWTYQ pNL4-3.seq 

3525 ATA GCA GAA ATA CAG AAG CAG GGG CAA GGC CAA TGG ACA TAT CAA 

4052 IAEIQKQGQGQWTYQ pHDMHgpm2.seq 

4052 ATC GCC GAG ATC CAG AAG CAG GGC CAG GGC CAG TGG ACC TAC CAG 

1 ■ 

3590 

1 

3572 IYQEPFKNLKTGKYA NL4-3 genbank.SEQ 

3572 ATT TAT CAA GAG CCA TTT AAA AAT CTG AAA ACA GGA AAA TAT GCA 

3570 IYQEPFKNLKTGKYA pNL4-3.seq 

3570 ATT TAT CAA GAG CCA TTT AAA AAT CTG AAA ACA GGA AAA TAT GCA 

4097 IYQEPFKNLKTGKYA pHDMHgpm2 . seq 
4097 ATC TAC CAG GAG CCC TTC AAG AAC CTG AAG ACC GGC AAA TAC GCC 

1 1 — 

3620 3650 
1 I 

3617 RMKGAHTN DVKQLT E NL4-3 genbank.SEQ 

3617 AGA ATG AAG GGT GCC CAC ACT AAT GAT GTG AAA CAA TTA ACA GAG 

3615 RMKGAHTN DVKQLTE pNL4-3.seq 

3615 AGA ATG AAG GGT GCC CAC ACT AAT GAT GTG AAA CAA TTA ACA GAG 

4142 RMKGAHTNDVKQLTE pHDMHgpm2 . seq 

4142 CGC ATG AAG GGC GCC CAC ACC AAC GAC GTG AAG CAG CTG ACC GAG 

1 — — ■ 

3680 

I 

3662 AVQKIATESIVIWGK NL4-3 genbank.SEQ 

3662 GCA GTA CAA AAA ATA GCC ACA GAA AGC ATA GTA ATA TGG GGA AAG 

3660 AVQKIATESIVIWGK pNL4-3.seq 

3660 GCA GTA CAA AAA ATA GCC ACA GAA AGC ATA GTA ATA TGG GGA AAG 

4187 AVQKIATESIVIWGK pHDMHgpm2 . seq 
4187 GCC GTG CAG AAG ATC GCC ACC GAG TCC ATC GTG ATC TGG GGC AAG 
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Alignment Report of Codon Optimization (pol).MEG, using Clustal method with PAM250 residue weight table. 
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KEPI I GAET FYVDGA 
AAA GAA CCC ATA ATA GGA GCA GAA ACT TTC TAT GTA GAT GGG GCA 

K E P I I GAET FYVDGA 
AAA GAA CCC ATA ATA GGA GCA GAA ACT TTC TAT GTA GAT GGG GCA 

KEPIIGAET FYVDGA 
AAG GAG CCC ATC ATC GGC GCC GAG ACC TTC TAC GTG GAC GGC GCC 
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3977 
3977 
3975 
3975 
4502 



3980 
I 



1 

4010 
I 



K 



H 



D 



AAG ACT GAG TTA CAA GCA ATT CAT CTA GCT TTG CAG GAT TCG GGA 



K 



H 



AAG ACT GAG TTA CAA GCA ATT CAT CTA GCT TTG CAG GAT TCG GGA 



K 



H 



4502 AAG ACC GAG CTG CAG GCC ATC CAC CTG GCC CTG CAA GAC TCC GGC 



NL4-3 genbank.SEQ 
pNL4-3 . seq 
pHDMHgpm2 . seq 



4022 
4022 
4020 
4020 
4547 
4547 



4067 
4067 
4065 
4065 
4592 



4157 
4157 
4155 
4155 
4682 



1 

4040 
i 



4202 
4202 
4200 
4200 
4727 



LEVNIVT DSQYALGI 
TTA GAA GTA AAC ATA GTG ACA GAC TCA CAA TAT GCA TTG GGA ATC 

LEVNIVT DSQYALGI 
TTA GAA GTA AAC ATA GTG ACA GAC TCA CAA TAT GCA TTG GGA ATC 

LEVNIVT DSQYALGI 
CTG GAG GTG AAC ATC GTG ACC GAC TCC CAG TAT GCA TTG GGC ATC 



NL4-3 genbank.SEQ 
pNL4-3 . seq 
pHDMHgpm2 . seq 



— I 

4070 
i 



1 

4100 
t 



D 



K 



V 



Q NL4-3 genbank.SEQ 



ATT CAA GCA CAA CCA GAT AAG AGT GAA TCA GAG TTA GTC AGT CAA 



K 



V 



Q pNL4-3.seq 



ATT CAA GCA CAA CCA GAT AAG AGT GAA TCA GAG TTA GTC AGT CAA 



K 



V 



Q pHDMHgpm2 . seq 



4592 ATC CAG GCC CAG CCC GAC AAG TCC GAG TCC GAG CTG GTG TCC CAG 

4130 

I 

4112 IIEQLI KKEKVYLAW NL4-3 genbank.SEQ 

4112 ATA ATA GAG CAG TTA ATA AAA AAG GAA AAA GTC TAC CTG GCA TGG 

4110 IIEQLI KKEKVYLAW pNL4-3.seq 

4110 ATA ATA GAG CAG TTA ATA AAA AAG GAA AAA GTC TAC CTG GCA TGG 

4637 IIEQLIKKEKVYLAW pHDMHgpm2.seq 

4637 ATC ATC GAG CAG CTG ATC AAG AAG GAG AAG GTG TAC CTG GCC TGG 



1 

4160 
I 



1 — 

4190 
I 



V 



H 



K 



N 



V 



D 



GTA CCA GCA CAC AAA GGA ATT GGA GGA AAT GAA CAA GTA GAT GGG 



V 



H 



K 



N 



V 



K 



GTA CCA GCA CAC AAA GGA ATT GGA GGA AAT GAA CAA GTA GAT AAG 



V 



H 



K 



N 



V 



D 



K 



NL4-3 genbank.SEQ 
pNL4-3 . seq 
pHDMHgpm2 .seq 



4682 GTG CCC GCC CAC AAG GGC ATC GGC GGC AAC GAG CAG GTG GAC AAG 



1 

4220 



V 



K 



V 



TTG GTC AGT GCT GGA ATC AGG AAA GTA CTA TTT TTA GAT 
LVSAGI RKVLFLD 

TTG GTC AGT GCT GGA ATC AGG AAA GTA CTA TTT TTA GAT 
LVSAGIRKVLFLD 



G I 
GGA ATA 

G I 
GGA ATA 

G I 



NL4-3 genbank.SEQ 
pNL4-3 . seq 
pHDMHgpm2 . seq 



4727 CTG GTG TCC GCC GGC ATC CGC AAG GTG CTG TTC CTG GAC GGC ATC 
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4247 
4247 
4245 
4245 
4772 
4772 



4292 
4292 
4290 
4290 
4817 
4817 



4337 
4337 
4335 
4335 
4862 
4862 



4382 
4382 
4380 
4380 
4907 



4427 
4427 
4425 
4425 
4952 
4952 



1 — 

4250 
» 



1 

4280 



DKAQEEHEKYHSNWR 
GAT AAG GCC CAA GAA GAA CAT GAG AAA TAT CAC AGT AAT TGG AGA 

DKAQEEHEKYHSNWR 
GAT AAG GCC CAA GAA GAA CAT GAG AAA TAT CAC AGT AAT TGG AGA 

DKAQEEHEKYHSNWR 
GAC AAG GCC CAG GAG GAG CAC GAG AAG TAC CAC TCC AAC TGG CGC 



1 

4310 
I 



M 



D 



N 



V 



V 



K 



GCA ATG GCT AGT GAT TTT AAC CTA CCA CCT GTA GTA GCA AAA GAA 



M 



N 



V 



V 



K 



GCA ATG GCT AGT GAT TTT AAC CTA CCA CCT GTA GTA GCA AAA GAA 



M 



N 



V 



V 



K 



GCC ATG GCC TCC GAC TTC AAC CTG CCC CCC GTG GTG GCC AAG GAG 



1 — 

4340 
i 



1 

4370 



IVAS CDKCQLKGEAM 
ATA GTA GCC AGC TGT GAT AAA TGT CAG CTA AAA GGG GAA GCC ATG 

IVASCDKCQLKGEAM 
ATA GTA GCC AGC TGT GAT AAA TGT CAG CTA AAA GGG GAA GCC ATG 

IVASCDKCQLKGEAM 
ATC GTG GCC TCC TGC GAC AAG TGC CAG CTG AAG GGC GAG GCC ATG 



1 

4400 
I 



H 



V 



W 



CAT GGA CAA GTA GAC TGT AGC CCA GGA ATA TGG CAG CTA GAT TGT 



H 



V 



W 



CAT GGA CAA GTA GAC TGT AGC CCA GGA ATA TGG CAG CTA GAT TGT 



H 



V 



D 



W 



D 



4907 CAC GGC CAG GTG GAC TGC TCC CCC GGC ATC TGG CAG CTG GAC TGC 



1 

4430 
■ 



1 

4460 
i 



THLEGKVI L V A V H V A 
ACA CAT TTA GAA GGA AAA GTT ATC TTG GTA GCA GTT CAT GTA GCC 

THLEGKVI L V A V H V A 
ACA CAT TTA GAA GGA AAA GTT ATC TTG GTA GCA GTT CAT GTA GCC 

THLEGKVI L V A V H V A 
ACC CAC CTG GAG GGC AAG GTG ATC CTG GTG GCC GTG CAC GTG GCC 



1 

4490 
l 



4472 
4472 
4470 
4470 
4997 
4997 



S 
AGT 
S 



GYIEAEVI PAETGQ 
GGA TAT ATA GAA GCA GAA GTA ATT CCA GCA GAG ACA GGG CAA 



V 



A 



AGT GGA TAT ATA GAA GCA GAA GTA ATT CCA GCA GAG ACA GGG CAA 

S GYI EAEVI PAETGQ 
TCC GGC TAC ATC GAG GCC GAG GTG ATC CCC GCC GAG ACC GGC CAG 



NL4-3 genbank.SEQ 
pNL4-3 . seq 
pHDMHgpm2 .seq 



NL4-3 genbank.SEQ 
pNL4-3 . seq 
pHDMHgpm2 . seq 



NL4-3 genbank.SEQ 
pNL4-3 . seq 
pHDMHgpm2 . seq 



NL4-3 genbank.SEQ 
pNL4-3 . seq 
pHDMHgpm2 . seq 



NL4-3 genbank . SEQ 
pNL4-3 . seq 
pHDMHgpm2 .seq 



NL4-3 genbank.SEQ 
pNL4-3 . seq 
pHDMHgpm2 . seq 
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— j — 

4520 
i 



— i — 
4550 



4517 
4517 
4515 
4515 
5042 
5042 



K 



W 



V 



GAA ACA GCA TAC TTC CTC TTA AAA TTA GCA GGA AGA TGG CCA GTA 



K 



A 



W 



V 



GAA ACA GCA TAC TTC CTC TTA AAA TTA GCA GGA AGA TGG CCA GTA 



K 



W 



GAG ACC GCC TAC TTC CTG CTG AAG CTG GCC GGC CGC TGG CCC 



V 
GTG 



NL4-3 genbank.SEQ 
pNL4-3 . seq 
pHDMHgpm2.seq 



4580 
I 



4562 
4562 
4560 
4560 
5087 
5087 



KTVHTDN GSNFTSTT 
AAA ACA GTA CAT ACA GAC AAT GGC AGC AAT TTC ACC AGT ACT ACA 

KTVHTDNGSNFTSTT 
AAA ACA GTA CAT ACA GAC AAT GGC AGC AAT TTC ACC AGT ACT ACA 

KTVHTDNGSNFTSTT 
AAG ACC GTG CAC ACC GAC AAC GGC TCC AAC TTC ACC TCC ACC ACC 



NL4-3 genbank.SEQ 
pNL4-3 . seq 
pHDMHgpm2 .seq 



— J — 

4610 
l 



— I 

4640 
1 



4607 
4607 
4605 
4605 
5132 



VKAACWWAGIKQEFG NL4-3 genbank.SEQ 
GTT AAG GCC GCC TGT TGG TGG GCG GGG ATC AAG CAG GAA TTT GGC 

VKAACWWAGIKQEFG pNL4-3.seq 
GTT AAG GCC GCC TGT TGG TGG GCG GGG ATC AAG CAG GAA TTT GGC 

VKAACWWAGIKQEFG pHDMHgpm2 . seq 



5132 GTG AAG GCC GCC TGC TGG TGG GCC GGC ATC AAG CAG GAG TTC GGC 



1 

4670 
I 



4652 
4652 
4650 
4650 
5177 



N 



V 



M 



N 



ATT CCC TAC AAT CCC CAA AGT CAA GGA GTA ATA GAA TCT ATG AAT 



N 



V 



M 



N 



ATT CCC TAC AAT CCC CAA AGT CAA GGA GTA ATA GAA TCT ATG AAT 



N 



V 



M 



N 



NL4-3 genbank.SEQ 
pNL4-3 . seq 
pHDMHgpm2 .seq 



5177 ATC CCC TAC AAC CCC CAG TCC CAG GGC GTG ATC GAG TCC ATG AAC 



1 

4700 



j — 

4730 
i 



4697 
4697 
4695 
4695 
5222 



K 



K 



K 



V 



D 



E NL4-3 genbank.SEQ 



AAA GAA TTA AAG AAA ATT ATA GGA CAG GTA AGA GAT CAG GCT GAA 



K 



K 



K 



V 



D 



pNL4-3 . seq 



AAA GAA TTA AAG AAA ATT ATA GGA CAG GTA AGA GAT CAG GCT GAA 



K 



K 



K 



V 



D 



E pHDMHgpm2 . seq 



5222 AAG GAG CTG AAG AAG ATC ATC GGC CAA GTC CGC GAC CAG GCC GAG 



1 

4760 
i 



4742 
4742 
4740 
4740 
5267 



HLKTAVQMAVFI HNF NL4-3 genbank.SEQ 
CAT CTT AAG ACA GCA GTA CAA ATG GCA GTA TTC ATC CAC AAT TTT 

HLKTAVQMAVFIHNF pNL4-3.seq 
CAT CTT AAG ACA GCA GTA CAA ATG GCA GTA TTC ATC CAC AAT TTT 



H 



K 



V 



M 



V 



H 



N 



F pHDMHgpm2 . seq 



5267 CAC CTG AAG ACC GCC GTG CAG ATG GCC GTG TTC ATC CAC AAC TTC 
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Q 

"Ogf 




4787 
4787 
4785 
4785 
5312 
5312 



4832 
4832 
4830 
4830 
5357 
5357 



4877 
4877 
4875 
4875 
5402 
5402 



1 

4790 
I 



i 

4820 
i 



KRKGGI GGYSAGERI 

AAA AGA AAA GGG GGG ATT GGG GGG TAC AGT GCA GGG GAA AGA ATA 

KRKGGI GGYSAGERI 

AAA AGA AAA GGG GGG ATT GGG GGG TAC AGT GCA GGG GAA AGA ATA 

KRKGGI GGYSAGERI 

AAG CGC AAG GGC GGC ATC GGC GGC TAC TCC GCC GGC GAG CGC ATC 



1 

4850 
i 



VDI IATDIQTKELQK 
GTA GAC ATA ATA GCA ACA GAC ATA CAA ACT AAA GAA TTA CAA AAA 

VDI IATDIQTKELQK 
GTA GAC ATA ATA GCA ACA GAC ATA CAA ACT AAA GAA TTA CAA AAA 

VDI IATDIQTKELQK 
GTG GAC ATC ATC GCC ACC GAC ATC CAG ACC AAG GAG CTG CAG AAG 



1 

4880 
I 



1 

4910 
I 



Q I T K I 
CAA ATT ACA AAA ATT 

Q I T K I 
CAA ATT ACA AAA ATT 

Q I T K I 



QNFRVYYRDS 
CAA AAT TTT CGG GTT TAT TAC AGG GAC AGC 

QNFRVYYRDS 
CAA AAT TTT CGG GTT TAT TAC AGG GAC AGC 

QNFRVYYRDS 



NL4-3 genbank.SEQ 
pNL4-3 . seq 
pHDMHgpm2 . seq 



NL4-3 genbank.SEQ 
pNL4-3 . seq 
pHDMHgpm2 . seq 



CAG ATC ACC AAG ATC CAG AAC TTC CGC GTG TAC TAC CGC GAC TCC 



NL4-3 genbank.SEQ 
pNL4-3 . seq 
pHDMHgpm2 .seq 



4940 
' 






4922 
4922 
4920 
4920 
5447 
5447 



4967 
4967 
4965 
4965 
5492 
5492 



5012 
5012 
5010 
5010 
5537 
5537 



RD PVWKG PAKLLWKG 
AGA GAT CCA GTT TGG AAA GGA CCA GCA AAG CTC CTC TGG AAA GGT 

RDPVWKGPAKLLWKG 
AGA GAT CCA GTT TGG AAA GGA CCA GCA AAG CTC CTC TGG AAA GGT 

RDPVWKGPAKLLWKG 
CGC GAC CCC GTG TGG AAG GGC CCC GCC AAG CTG CTG TGG AAG GGC 



1 

4970 
t 



1 

5000 
i 



EGAVVI QDNSDI KVV 
GAA GGG GCA GTA GTA ATA CAA GAT AAT AGT GAC ATA AAA GTA GTG 

EGAVVI QDNSDI KVV 
GAA GGG GCA GTA GTA ATA CAA GAT AAT AGT GAC ATA AAA GTA GTG 

EGAVVI QDNS DI KVV 
GAG GGC GCC GTG GTG ATC CAG GAC AAC TCC GAC ATC AAG GTG GTG 



1 

5030 



PRRKAKI I RDYGKQM 
CCA AGA AGA AAA GCA AAG ATC ATC AGG GAT TAT GGA AAA CAG ATG 

PRRKAKI I RDYGKQM 
CCA AGA AGA AAA GCA AAG ATC ATC AGG GAT TAT GGA AAA CAG ATG 

PRRKAKI I RDYGKQM 
CCC CGC CGC AAG GCC AAG ATC ATC CGC GAC TAC GGC AAG CAG ATG 



NL4-3 genbank.SEQ 
pNL4-3 . seq 
pHDMHgpm2 . seq 



NL4-3 genbank.SEQ 
pNL4-3 . seq 
pHDMHgpm2 .seq 



NL4-3 genbank.SEQ 
pNL4-3 . seq 
pHDMHgpm2 . seq 
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5057 
5057 
5055 
5055 
5582 
5582 



— i — 

5060 
i 



— i — 

5090 
I 



AGDDCVASRQDED 
GCA GGT GAT GAT TGT GTG GCA AGT AGA CAG GAT GAG GAT $AA 

AGDDCVASRQDED 
GCA GGT GAT GAT TGT GTG GCA AGT AGA CAG GAT GAG GAT IpEAA 

AGDDCVASRQDED 
GCC GGC GAC GAC TGC GTG GCC TCC CGC CAG GAC GAG GAC TAA 



NL4-3 genbank.SEQ 



pNL4-3 . seq 
pHDMHgpm2.seq 



Fig. 9L 






AGCTTGGCCC 

GTCCAACATT 

CGGGGTCATT 

GCCCGCCTGG 

CCATAGTAAC 

CTGCCCACTT 

ATGACGGTAA 

CTTGGCAGTA 

ACATCAATGG 

ACGTCAATGG 

ACTCCGCCCC 

GAGCTCGTTT 

ATAGAAGACA 

GAGTCTATGG 

TAGGAAGGGG 

TAATTTTAAA 

TTTCCCTAAT 

ATTCTAAAGA 

AATATTTCTG 

CAATCCAGCT 

CCAAGCTAGG 

GGCAACGTGC 

GGCGCCCGCG 

CGCCCCGGCG 

GAGCGCTTCG 

GGCCAGCTGC 

ATCGCCGTGC 

GACAAGATCG 

ACCGGCAACA 

ATGGTGCACC 

AAGGCCTTCT 

CAGGACCTGA 

AAGGAGACCA 

CCCATCGCCC 

ACCCTGCAAG 

TACAAGCGCT 

ATCCTGGACA 

AAGACCCTGC 

CTGGTGCAGA 

ACCCTGGAGG 

GTGCTGGCCG 

AACTTCCGCA 

GCCAAGAACT 

CAGATGAAAG 

AAGGGAAGGC 

AGCTTCAGGT 

GAACTGTATC 



ATTGCATACG 

ACCGCCATGT 

AGTTCATAGC 

CTGACCGCCC 

GCCAATAGGG 

GGCAGTACAT 

ATGGCCCGCC 

CATCTACGTA 

GCGTGGATAG 

GAGTTTGTTT 

ATTGACGCAA 

AGTGAACCGT 

CCGGGACCGA 

GACCCTTGAT 

AGAAGTAACA 

AAATGCTTTC 

CTCTTTCTTT 

ATAACAGTGA 

CATATAAATT 

ACCATTCTGC 

CCCTTTTGCT 

TGGTCTGTGT 

CCTCCGTGCT 

GCAAGAAGCA 

CCGTGAACCC 

AGCCCTCCCT 

TGTACTGCGT 

AGGAGGAGCA 

ACTCCCAGGT 

AGGCCATCTC 

CCCCCGAAGT 

ACACCATGCT 

TCAACGAGGA 

CCGGCCAGAT 

AGCAGATCGG 

GGATCATCCT 

TCCGCCAGGG 

GCGCCGAGCA 

ACGCCAACCC 

AGATGATGAC 

AGGCCATGTC 

ACCAGCGCAA 

GCCGCGCCCC 

ATTGTACTGA 

CAGGGAATTT 

TTGGGGAAGA 

CTTTAGC TTC 



TTGTATCCAT 

TGACATTGAT 

CCATATATGG 

AACGACCCCC 

ACTTTCCATT 

CAAGTGTATC 

TGGCATTATG 

TTAGTCATCG 

CGGTTTGACT 

TGGCACC AAA 

ATGGGCGGTA 

CAGATCGCCT 

TCCAGCCTCC 

GTTTTCTTTC 

GGGTACACAT 

TTCTTTTAAT 

CAGGGC AATA 

TAATTTCTGG 

GTAACTGATG 

TTTT AT TTT A 

AATCATGTTC 

GCTGGCC CAT 

GTCCGGCGGC 

GTACAAGCTG 

CGGCCTGCTG 

GCAAACCGGC 

GCACCAGCGC 

GAACAAGTCC 

GTCCCAGAAC 

CCCCCGCACC 

CATCCCCATG 

GAACACCGTG 

GGCCGCCGAG 

GCGCGAGCCC 

CTGGATGACC 

GGGCCTGAAC 

CCCCAAGGAG 

GGCCTCCCAG 

CGACTGCAAG 

CGCCTGCCAG 

CCAAGTCACC 

GACCGTGAAG 

CCGCAAGAAG 

GAGACAGGCT 

TCTTCAGAGC 

GACAACAACT 

CCTCAGATCA 



ATCATAATAT 

TATTGACTAG 

AGTTCCGCGT 

GCCCATTGAC 

GACGTCAATG 

ATATGCCAAG 

C C CAGTAC AT 

CTATT AC CAT 

CACGGGGATT 

ATCAAC GGGA 

GGCGTGT AC G 

GGAGACGCCA 

CCTCGAAGCT 

CCCTTCTTTT 

ATTG AC C AAA 

ATACTTTTTT 

ATGATACAAT 

GTTAAGGCAA 

TAAGAGGTTT 

TGGTTGGGAT 

ATACCTCTTA 

CACTTTGGCA 

GAGCTGGACA 

AAGCACATCG 

GAG AC C T C CG 

TCCGAGGAGC 

ATCGACGTGA 

AAGAAGAAGG 

TACCCCATCG 

CTGAACGCCT 

TTCTCCGCCC 

GGCGGCCACC 

TGGGACCGCC 

CGCGGCTCCG 

CACAACCCCC 

AAGATCGTGC 

CCCTTCCGCG 

GAGGTAAAGA 

ACCATCCTGA 

GGCGTGGGCG 

AACCCCGCCA 

TGCTTCAACT 

GGCTGCTGGA 

AATTTTTTAG 

AGACCAGAGC 

CCCTCTCAGA 

CTCTTTGGCA 



GTACATTTAT 

TTATTAATAG 

TACATAACTT 

GTCAATAATG 

GGTGGAGTAT 

TACGCCCCCT 

GACCTTATGG 

GGTGATGCGG 

TCCAAGTCTC 

CTTTCCAAAA 

GTGGGAGGTC 

TCCACGCTGT 

GATCCTGAGA 

CTATGGTTAA 

TCAGGGTAAT 

GTTTATCTTA 

GTATCATGCC 

TAGCAATATT 

CATATTGCTA 

AAGGCTGGAT 

TCTTCCTCCC 

AAGAATTCTA 

AGTGGGAGAA 

TGTGGGCCTC 

AGGGCTGCCG 

TGCGCTCCCT 

AGGACACCAA 

CCCAGCAGGC 

TGCAGAACCT 

GGGTGAAGGT 

TGTCCGAGGG 

AGGCCGCCAT 

TGCACCCCGT 

ACATCGCCGG 

CCATCCCCGT 

GCATGTACTC 

ACTACGTGGA 

ACTGGATGAC 

AGGCCCTGGG 

GCCCCGGCCA 

CCATCATGAT 

GCGGCAAGGA 

AGTGCGGCAA 

GGAAGATCTG 

CAACAGCCCC 

AGCAGGAGCC 

GCGACCCCTC 



ATTGGCTCAT 


60 


TAATCAATTA 


120 


ACGGTAAATG 


180 


ACGTATGTTC 


240 


TTAC GGT AAA 


300 


ATTGACGTCA 


360 


GACTTTCCTA 


420 


TTTTGGCAGT 


480 


CACCCCATTG 


540 


TGTCGTAACA 


a A 

600 


TATATAAGCA 


660 


TTTGACCTCC 


720 


ACTTCAGGGT 


780 


GTTCATGTCA 


840 


TTTGCATTTG 


900 


TTTCTAATAC 


960 


TCTTTGCACC 


1020 


TCTGCATATA 


1080 


ATAGCAGCTA 


1140 


TATTCTGAGT 


1200 


ACAGCTCCTG 


1260 


GACTGCCATG 


1320 


GATCCGCCTG 


1380 


CCGCGAGCTG 


1440 


CCAGATCCTG 


1500 


GTACAACACC 


1560 


GGAGGCCCTG 


1620 


CGCCGCCGAC 


1680 


GCAGGGCCAG 


1740 


GGTGGAGGAG 


1800 


CGCCACCCCC 


1860 


GCAGATGCTG 


1920 


GCACGCCGGC 


1980 


CACCACCTCC 


2040 


GGGCGAGATC 


2100 


CCCCACCTCC 


2160 


CCGCTTCTAC 


2220 


CGAGACCCTG 


2280 


CCCCGGCGCC 


2340 


CAAGGCCCGC 


2400 


CCAGAAGGGC 


2460 


GGGCCACATC 


2520 


GGAGGGCCAC 


2580 


GCCTTCCCAC 


2640 


ACCAGAAGAG 


2700 


GATAGACAAG 


2760 


GTCACAATAA 


2820 



Fig. 10A 






AGATCGGTGG CCAGCTGAAG GAGGCCCTGC TGGACACCGG CGCCGACGAC ACCGTGCTGG 2880
AGGAGATGAA CCTGCCCGGC CGCTGGAAGC CCAAGATGAT CGGCGGCATC GGCGGCTTCA 2940
TCAAAGTCCG CCAGTACGAC CAGATCCTGA TCGAGATCTG CGGCCACAAG GCCATCGGCA 3000
CCGTGCTGGT GGGCCCCACC CCCGTGAACA TCATCGGCCG CAACCTGCTG ACCCAGATCG 3060
GCTGCACCCT GAACTTCCCC ATCTCCCCCA TCGAGACCGT GCCCGTGAAG CTGAAGCCCG 3120
GCATGGACGG CCCCAAAGTC AAGCAGTGGC CCCTGACCGA GGAGAAGATC AAGGCCCTGG 3180
TGGAGATCTG CACCGAGATG GAGAAGGAGG GCAAGATCTC CAAGATCGGC CCCGAGAACC 3240
CCTACAACAC CCCCGTGTTC GCCATCAAGA AGAAGGACTC CACCAAGTGG CGCAAGCTGG 3300
TGGACTTCCG CGAGCTGAAC AAGCGCACCC AGGACTTCTG GGAGGTGCAG CTGGGCATCC 3360
CCCACCCCGC CGGCCTGAAG CAGAAGAAGT CCGTGACCGT GCTGGACGTG GGCGACGCCT 3420
ACTTCTCCGT GCCCCTGGAC AAGGACTTCC GCAAGTACAC CGCCTTCACC ATCCCCTCCA 3480
TCAACAACGA GACCCCCGGC ATCCGCTACC AGTACAACGT GCTGCCCCAG GGCTGGAAGG 3540
GCTCCCCCGC CATCTTCCAG TGCTCCATGA CCAAGATCCT GGAGCCCTTC CGCAAGCAGA 3600
ACCCCGACAT CGTGATCTAC CAGTACATGG ACGACCTGTA CGTGGGCTCC GACCTGGAGA 3660
TCGGCCAGCA CCGCACCAAG ATCGAGGAGC TGCGCCAGCA CCTGCTGCGC TGGGGCTTCA 3720
CCACCCCCGA CAAGAAGCAC CAGAAGGAGC CCCCCTTCCT GTGGATGGGC TACGAGCTGC 3780
ACCCCGACAA GTGGACCGTG CAGCCCATCG TGCTGCCCGA GAAGGACTCC TGGACCGTGA 3840
ACGACATCCA GAAGCTGGTG GGCAAGCTGA ACTGGGCCTC CCAGATCTAC GCCGGCATCA 3900
AAGTCCGCCA GCTGTGCAAG CTGCTGCGCG GCACCAAGGC CCTGACCGAG GTGGTGCCCC 3960
TGACCGAGGA GGCCGAGCTG GAGCTGGCCG AGAACCGCGA GATCCTGAAG GAGCCCGTGC 4020
ACGGCGTGTA CTACGACCCC TCCAAGGACC TGATCGCCGA GATCCAGAAG CAGGGCCAGG 4080
GCCAGTGGAC CTACCAGATC TACCAGGAGC CCTTCAAGAA CCTGAAGACC GGCAAATACG 4140
CCCGCATGAA GGGCGCCCAC ACCAACGACG TGAAGCAGCT GACCGAGGCC GTGCAGAAGA 4200
TCGCCACCGA GTCCATCGTG ATCTGGGGCA AGACTCCCAA GTTCAAGCTG CCCATCCAGA 4260
AGGAGACCTG GGAGGCCTGG TGGACCGAGT ACTGGCAGGC CACCTGGATC CCCGAGTGGG 4320
AGTTCGTGAA CACCCCCCCC CTGGTGAAGC TGTGGTACCA GCTGGAGAAG GAGCCCATCA 4380
TCGGCGCCGA GACCTTCTAC GTGGACGGCG CCGCCAACCG CGAGACCAAG CTGGGCAAGG 4440
CCGGCTACGT GACCGACCGC GGCCGCCAGA AGGTGGTGCC CCTGACCGAC ACCACCAACC 4500
AGAAGACCGA GCTGCAGGCC ATCCACCTGG CCCTGCAAGA CTCCGGCCTG GAGGTGAACA 4560
TCGTGACCGA CTCCCAGTAT GCATTGGGCA TCATCCAGGC CCAGCCCGAC AAGTCCGAGT 4620
CCGAGCTGGT GTCCCAGATC ATCGAGGAGC TGATCAAGAA GGAGAAGGTG TACCTGGCCT 4680
GGGTGCCCGC CCACAAGGGC ATCGGCGGCA ACGAGCAGGT GGACAAGCTG GTGTCCGCCG 4740
GCATCCGCAA GGTGCTGTTC CTGGACGGCA TCGACAAGGC CCAGGAGGAG CACGAGAAGT 4800
ACCACTCCAA CTGGCGCGCC ATGGCCTCCG ACTTCAACCT GCCCCCCGTG GTGGCCAAGG 4860
AGATCGTGGC CTCCTGCGAC AAGTGCCAGC TGAAGGGCGA GGCCATGCAC GGCCAGGTGG 4920
ACTGCTCCCC CGGCATCTGG CAGCTGGACT GCACCCACCT GGAGGGCAAG GTGATCCTGG 4980
TGGCCGTGCA CGTGGCCTCC GGCTACATCG AGGCCGAGGT GATCCCCGCC GAGACCGGCC 5040
AGGAGACCGC CTACTTCCTG CTGAAGCTGG CCGGCCGCTG GCCCGTGAAG ACCGTGCACA 5100
CCGACAACGG CTCCAACTTC ACCTCCACCA CCGTGAAGGC CGCCTGCTGG TGGGCCGGCA 5160
TCAAGCAGGA GTTCGGCATC CCCTACAACC CCCAGTCCCA GGGCGTGATC GAGTCCATGA 5220
ACAAGGAGCT GAAGAAGATC ATCGGCCAAG TCCGCGACCA GGCCGAGCAC CTGAAGACCG 5280
CCGTGCAGAT GGCCGTGTTC ATCCACAACT TCAAGCGCAA GGGCGGCATC GGCGGCTACT 5340
CCGCCGGCGA GCGCATCGTG GACATCATCG CCACCGACAT CCAGACCAAG GAGCTGCAGA 5400
AGCAGATCAC CAAGATCCAG AACTTCCGCG TGTACTACCG CGACTCCCGC GACCCCGTGT 5460
GGAAGGGCCC CGCCAAGCTG CTGTGGAAGG GCGAGGGCGC CGTGGTGATC CAGGACAACT 5520
CCGACATCAA GGTGGTGCCC CGCCGCAAGG CCAAGATCAT CCGCGACTAC GGCAAGCAGA 5580
TGGCCGGCGA CGACTGCGTG GCCTCCCGCC AGGACGAGGA CTAACACATG GAAAAGATTA 5640
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GTAAAACACC ATAGGCCGCT CTAGAGGATC CAAGCTTATC GATACCGTCG ACCTCGAGGG 5700
CCCAGATCTA ATTCACCCCA CCAGTGCAGG CTGCCTATCA GAAAGTGGTG GCTGGTGTGG 5760
CTAATGCCCT GGCCCACAAG TATCACTAAG CTCGCTTTCT TGCTGTCCAA TTTCTATTAA 5820
AGGTTCCTTT GTTCCCTAAG TCCAACTACT AAACTGGGGG ATATTATGAA GGGCCTTGAG 5880
CATCTGGATT CTGCCTAATA AAAAACATTT ATTTTCATTG CAATGATGTA TTTAAATTAT 5940
TTCTGAATAT TTTACTAAAA AGGGAATGTG GGAGGTCAGT GCATTTAAAA CATAAAGAAA 6000
TGAAGAGCTA GTTCAAACCT TGGGAAAATA CACTATATCT TAAACTCCAT GAAAGAAGGT 6060
GAGGCTGCAA ACAGCTAATG CACATTGGCA ACAGCCCCTG ATGCCTATGC CTTATTCATC 6120
CCTCAGAAAA GGATTCAAGT AGAGGCTTGA TTTGGAGGTT AAAGTTTTGC TATGCTGTAT 6180
TTTACATTAC TTATTGTTTT AGCTGTCCTC ATGAATGTCT TTTCACTACC CATTTGCTTA 6240
TCCTGCATCT CTCAGCCTTG ACTCCACTCA GTTCTCTTGC TTAGAGATAC CACCTTTCCC 6300
CTGAAGTGTT CCTTCCATGT TTTACGGCGA GATGGTTTCT CCTCGCCTGG CCACTCAGCC 6360
TTAGTTGTCT CTGTTGTCTT ATAGAGGTCT ACTTGAAGAA GGAAAAACAG GGGGCATGGT 6420
TTGACTGTCC TGTGAGCCCT TCTTCCCTGC CTCCCCCACT CACAGTGACC CGGAATCCCT 6480
CGACATGGCA GTCTAGATCA TTCTTGAAGA CGAAAGGGCC TCGTGATACG CCTATTTTTA 6540
TAGGTTAATG TCATGATAAT AATGGTTTCT TAGACGTCAG GTGGCACTTT TCGGGGAAAT 6600
GTGCGCGGAA CCCCTATTTG TTTATTTTTC TAAATACATT CAAATATGTA TCCGCTCATG 6660
AGACAATAAC CCTGATAAAT GCTTCAATAA TATTGAAAAA GGAAGAGTAT GAGTATTCAA 6720
CATTTCCGTG TCGCCCTTAT TCCCTTTTTT GCGGCATTTT GCCTTCCTGT TTTTGCTCAC 6780
CCAGAAACGC TGGTGAAAGT AAAAGATGCT GAAGATCAGT TGGGTGCACG AGTGGGTTAC 6840
ATCGAACTGG ATCTCAACAG CGGTAAGATC CTTGAGAGTT TTCGCCCCGA AGAACGTTTT 6900
CCAATGATGA GCACTTTTAA AGTTCTGCTA TGTGGCGCGG TATTATCCCG TATTGACGCC 6960
GGGCAAGAGC AACTCGGTCG CCGCATACAC TATTCTCAGA ATGACTTGGT TGAGTACTCA 7020
CCAGTCACAG AAAAGCATCT TACGGATGGC ATGACAGTAA GAGAATTATG CAGTGCTGCC 7080
ATAACCATGA GTGATAACAC TGCGGCCAAC TTACTTCTGA CAACGATCGG AGGACCGAAG 7140
GAGCTAACCG CTTTTTTGCA CAACATGGGG GATCATGTAA CTCGCCTTGA TCGTTGGGAA 7200
CCGGAGCTGA ATGAAGCCAT ACCAAACGAC GAGCGTGACA CCACGATGCC TGTAGCAATG 7260
GCAACAACGT TGCGCAAACT ATTAACTGGC GAACTACTTA CTCTAGCTTC CCGGCAACAA 7320
TTAATAGACT GGATGGAGGC GGATAAAGTT GCAGGACCAC TTCTGCGCTC GGCCCTTCCG 7380
GCTGGCTGGT TTATTGCTGA TAAATCTGGA GCCGGTGAGC GTGGGTCTCG CGGTATCATT 7440
GCAGCACTGG GGCCAGATGG TAAGCCCTCC CGTATCGTAG TTATCTACAC GACGGGGAGT 7500
CAGGCAACTA TGGATGAACG AAATAGACAG ATCGCTGAGA TAGGTGCCTC ACTGATTAAG 7560
CATTGGTAAC TGTCAGACCA AGTTTACTCA TATATACTTT AGATTGATTT AAAACTTCAT 7620
TTTTAATTTA AAAGGATCTA GGTGAAGATC CTTTTTGATA ATCTCATGAC CAAAATCCCT 7680
TAACGTGAGT TTTCGTTCCA CTGAGCGTCA GACCCCGTAG AAAAGATCAA AGGATCTTCT 7740
TGAGATCCTT TTTTTCTGCG CGTAATCTGC TGCTTGCAAA CAAAAAAACC ACCGCTACCA 7800
GCGGTGGTTT GTTTGCCGGA TCAAGAGCTA CCAACTCTTT TTCCGAAGGT AACTGGCTTC 7860
AGCAGAGCGC AGATACCAAA TACTGTTCTT CTAGTGTAGC CGTAGTTAGG CCACCACTTC 7920
AAGAACTCTG TAGCACCGCC TACATACCTC GCTCTGCTAA TCCTGTTACC AGTGGCTGCT 7980
GCCAGTGGCG ATAAGTCGTG TCTTACCGGG TTGGACTCAA GACGATAGTT ACCGGATAAG 8040
GCGCAGCGGT CGGGCTGAAC GGGGGGTTCG TGCACACAGC CCAGCTTGGA GCGAACGACC 8100
TACACCGAAC TGAGATACCT ACAGCGTGAG CTATGAGAAA GCGCCACGCT TCCCGAAGGG 8160
AGAAAGGCGG ACAGGTATCC GGTAAGCGGC AGGGTCGGAA CAGGAGAGCG CACGAGGGAG 8220
CTTCCAGGGG GAAACGCCTG GTATCTTTAT AGTCCTGTCG GGTTTCGCCA CCTCTGACTT 8280
GAGCGTCGAT TTTTGTGATG CTCGTCAGGG GGGCGGAGCC TATGGAAAAA CGCCAGCAAC 8340
GGATGCGCCG CGTGCGGCTG CTGGAGATGG CGGACGCGAT GGATATGTTC TGCCAAGGGT 8400
TGGTTTGCGC ATTCACAGTT CTCCGCAAGA ATTGATTGGC TCCAATTCTT GGAGTGGTGA 8460
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ATCCGTTAGC GAGGTGCCGC CGGCTTCCAT TCAGGTCGAG GTGGCCCGGC TCCATGCACC 8520
GCGACGCAAC GCGGGGAGGC AGACAAGGTA TAGGGCGGCG CCTACAATCC ATGCCAACCC 8580
GTTCCATGTG CTCGCCGAGG CGGCATAAAT CCCCGTGACG ATCAGCGGTC CAATGATCGA 8640
AGTTAGGCTG GTAAGAGCCG CGAGCGATCC TTGAAGCTGT CCCTGATGGT CGTCATCTAC 8700
CTGCCTGGAC AGCATGGCCT GCAACGCGGG CATCCCGATG CCGCCGGAAG CGAGAAGAAT 8760
CATAATGGGG AAGGCCATCC AGCCTCGCGT CGGGGAGCTT TTTGCAAAAG CCTAGGCCTC 8820
CAAAAAAGCC TCCTCACTAC TTCTGGAATA GCTCAGAGGC CGAGGCGGCC TCGGCCTCTG 8880
CATAAATAAA AAAAATTAGT CAGCCATG 8908
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